A Decomposable Forward Process in Diffusion Models for Time-Series Forecasting

Caldas, Francisco, Kumar, Sahil, Soares, Cláudia

arXiv.org Machine Learning

We introduce a model-agnostic forward diffusion process for time-series forecasting that decomposes signals into spectral components, preserving structured temporal patterns such as seasonality more effectively than standard diffusion. Unlike prior work that modifies the network architecture or diffuses directly in the frequency domain, our proposed method alters only the diffusion process itself, making it compatible with existing diffusion backbones (e.g., DiffWave, TimeGrad, CSDI). By staging noise injection according to component energy, it maintains high signal-to-noise ratios for dominant frequencies throughout the diffusion trajectory, thereby improving the recoverability of long-term patterns. This strategy enables the model to maintain the signal structure for a longer period in the forward process, leading to improved forecast quality. Across standard forecasting benchmarks, we show that applying spectral decomposition strategies, such as the Fourier or Wavelet transform, consistently improves upon diffusion models using the baseline forward process, with negligible computational overhead. The code for this paper is available at https://anonymous.4open.science/r/D-FDP-4A29.
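As a rough illustration of "staging noise injection according to component energy", the sketch below ranks the Fourier components of a toy seasonal series by energy. The staging rule (noise the weakest components first, the dominant one last) is an assumption made here for illustration; the abstract does not give the actual schedule, and the function names are hypothetical, not taken from the authors' code.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, adequate for short toy series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def rank_components_by_energy(x):
    """Return positive-frequency bins sorted from highest to lowest energy."""
    n = len(x)
    spectrum = dft(x)
    energy = {k: abs(spectrum[k]) ** 2 for k in range(1, n // 2 + 1)}
    return sorted(energy, key=energy.get, reverse=True), energy

# Toy series: a strong seasonal cycle (bin 4) plus a weak faster cycle (bin 11).
n = 48
signal = [3.0 * math.sin(2 * math.pi * 4 * t / n)
          + 0.5 * math.sin(2 * math.pi * 11 * t / n) for t in range(n)]

order, energy = rank_components_by_energy(signal)

# Assumed staging: weakest components are noised first, so the dominant
# seasonal component keeps a high signal-to-noise ratio the longest.
noising_order = list(reversed(order))
print("dominant bins:", order[:2])
print("noised last:", noising_order[-1])
```

Under this assumed ordering, the seasonal bin is the last one to be corrupted, which is the qualitative behavior the abstract describes.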


CHEM: Estimating and Understanding Hallucinations in Deep Learning for Image Processing

Li, Jianfei, Rosellon-Inclan, Ines, Kutyniok, Gitta, Starck, Jean-Luc

arXiv.org Artificial Intelligence

U-Net and other U-shaped architectures have achieved significant success in image deconvolution tasks. However, challenges have emerged, as these methods might generate unrealistic artifacts or hallucinations, which can interfere with analysis in safety-critical scenarios. This paper introduces a novel approach for quantifying and comprehending hallucination artifacts to ensure trustworthy computer vision models. Our method, termed the Conformal Hallucination Estimation Metric (CHEM), is applicable to any image reconstruction model, enabling efficient identification and quantification of hallucination artifacts. It offers two key advantages: it leverages wavelet and shearlet representations to efficiently extract hallucinations of image features and uses conformalized quantile regression to assess hallucination levels in a distribution-free manner. Furthermore, from an approximation theoretical perspective, we explore the reasons why U-shaped networks are prone to hallucinations. We test the proposed approach on the CANDELS astronomical image dataset with models such as U-Net, Swin-UNet, and Learnlets, and provide new perspectives on hallucination from different aspects in deep learning-based image processing.
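For readers unfamiliar with conformalized quantile regression, the sketch below shows the generic split-conformal calibration step: predicted lower and upper quantiles are widened by a margin computed on held-out data. This is the textbook procedure, not CHEM itself; how CHEM applies it to wavelet and shearlet coefficients is not described in the abstract, and the toy numbers are invented for illustration.

```python
import math

def cqr_margin(lower, upper, targets, alpha):
    """Split-conformal margin for quantile-regression intervals.

    Each calibration point is scored by how far its target falls outside
    [lower, upper] (the score is negative when the target lies inside).
    The margin is the ceil((n + 1) * (1 - alpha))-th smallest score.
    """
    scores = sorted(max(lo - y, y - hi)
                    for lo, hi, y in zip(lower, upper, targets))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points, only an infinite margin is valid.
    return math.inf if rank > n else scores[rank - 1]

# Toy calibration set: the model always predicts the interval [0, 1].
calib_targets = [0.5, 1.25, -0.25, 0.75, 1.5, 0.0, 2.0]
calib_lower = [0.0] * len(calib_targets)
calib_upper = [1.0] * len(calib_targets)

margin = cqr_margin(calib_lower, calib_upper, calib_targets, alpha=0.25)
interval = (0.0 - margin, 1.0 + margin)
print(margin, interval)  # 0.5 (-0.5, 1.5)
```

The margin depends only on the ranks of the calibration scores, which is what makes the resulting coverage guarantee distribution-free.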


In Search of Adam's Secret Sauce

Orvieto, Antonio, Gower, Robert M.

arXiv.org Artificial Intelligence

Understanding the remarkable efficacy of Adam when training transformer-based language models has become a central research topic within the optimization community. To gain deeper insights, several simplifications of Adam have been proposed, such as the signed gradient and signed momentum methods. In this work, we conduct an extensive empirical study (training over 1500 language models across different data configurations and scales) comparing Adam to several known simplified variants. We find that signed momentum methods are faster than SGD, but consistently underperform relative to Adam, even after careful tuning of momentum, clipping setting, and learning rates. However, our analysis reveals a compelling option that preserves near-optimal performance while allowing for new insightful reformulations: constraining the Adam momentum parameters to be equal, beta1 = beta2. Beyond robust performance, this choice affords new theoretical insights, highlights the "secret sauce" on top of signed momentum, and grants a precise statistical interpretation: we show that Adam in this setting implements a natural online algorithm for estimating the mean and variance of gradients, one that arises from a mean-field Gaussian variational inference perspective.
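The mean-and-variance reading of beta1 = beta2 can be checked numerically. With a shared decay beta, the second-moment average v equals m^2 plus an exponentially weighted variance estimate, which follows from expanding the two moving-average recursions. The sketch below is a simplified scalar version that omits bias correction and the epsilon term; it is not the authors' code, and the function name is hypothetical.

```python
import random

def adam_moments_equal_betas(grads, beta=0.95):
    """Track Adam's moving averages with beta1 == beta2 == beta.

    Alongside m (first moment) and v (second moment), track `var`, an
    online variance estimate that uses the deviation from the previous mean.
    """
    m = v = var = 0.0
    history = []
    for g in grads:
        var = beta * var + beta * (1 - beta) * (g - m) ** 2  # uses old m
        m = beta * m + (1 - beta) * g
        v = beta * v + (1 - beta) * g * g
        history.append((m, v, var))
    return history

random.seed(0)
grads = [random.gauss(0.5, 2.0) for _ in range(200)]
history = adam_moments_equal_betas(grads)

# Largest deviation from the identity v == m**2 + var over the whole run.
max_gap = max(abs(v - (m * m + var)) for m, v, var in history)
print(f"max |v - (m^2 + var)| = {max_gap:.2e}")
```

Under this identity the update direction m / sqrt(v) becomes m / sqrt(m^2 + var), so the step shrinks as the estimated gradient variance grows relative to the estimated mean.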


VAIN: Attentional Multi-agent Predictive Modeling

Yedid Hoshen

Neural Information Processing Systems

One of the drawbacks of Interaction Networks (INs) is that they scale with the number of interactions in the system (typically quadratic or higher order in the number of agents). In this paper we introduce VAIN, a novel attentional architecture for multi-agent predictive modeling that scales linearly with the number of agents. We show that VAIN is effective for multi-agent predictive modeling.



Bringing Federated Learning to Space

Kim, Grace, Svoboda, Filip, Lane, Nicholas

arXiv.org Artificial Intelligence

As Low Earth Orbit (LEO) satellite constellations rapidly expand to hundreds and thousands of spacecraft, the need for distributed on-board machine learning becomes critical to address downlink bandwidth limitations. Federated learning (FL) offers a promising framework to conduct collaborative model training across satellite networks. Realizing its benefits in space naturally requires addressing space-specific constraints, from intermittent connectivity to dynamics imposed by orbital motion. This work presents the first systematic feasibility analysis of adapting off-the-shelf FL algorithms for satellite constellation deployment. We introduce a comprehensive "space-ification" framework that adapts terrestrial algorithms (FedAvg, FedProx, FedBuff) to operate under orbital constraints, producing an orbital-ready suite of FL algorithms. We then evaluate these space-ified methods through extensive parameter sweeps across 768 constellation configurations that vary cluster sizes (1-10), satellites per cluster (1-10), and ground station networks (1-13). Our analysis demonstrates that space-adapted FL algorithms efficiently scale to constellations of up to 100 satellites, achieving performance close to the centralized ideal. Multi-month training cycles can be reduced to days, corresponding to a 9x speedup through orbital scheduling and local coordination within satellite clusters. These results provide actionable insights for future mission designers, enabling distributed on-board learning for more autonomous, resilient, and data-driven satellite operations. Low Earth Orbit (LEO) satellite constellations are expanding rapidly, supporting applications in Earth observation (EO), telecommunications, and navigation. Large-scale constellations such as Planet Labs' Dove fleet, SpaceX's Starlink, and Amazon's Project Kuiper already consist of hundreds to thousands of spacecraft, representing some of the largest distributed systems ever deployed.
This unprecedented scale is driving a dramatic increase in the volume and diversity of space-based data. Earth observation missions in particular bear the brunt of this data challenge. High-resolution missions such as Landsat-8 produce 1.8 GB per scene and more than 400 TB annually [1]. At constellation scale, Planet Labs' fleet of over 200 satellites generates terabytes of imagery each day [2].
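For context on the terrestrial baseline named in the abstract, the sketch below shows the standard FedAvg aggregation step: client parameters are averaged with weights proportional to each client's local sample count. It is a generic illustration only; the paper's space-ified variants, orbital scheduling, and cluster coordination are not modeled, and the satellite data here is invented.

```python
def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors (FedAvg)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Three hypothetical satellites, each holding a two-parameter local model.
satellite_models = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
local_samples = [1, 1, 2]  # the third satellite saw twice as much data

global_model = fedavg(satellite_models, local_samples)
print(global_model)  # [3.5, 4.5]
```

In a constellation setting, the open question the paper addresses is when such an aggregation can happen at all, since satellites only exchange updates during contact windows.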




Deep Learning-Driven Downscaling for Climate Risk Assessment of Projected Temperature Extremes in the Nordic Region

Loganathan, Parthiban, Zea, Elias, Vinuesa, Ricardo, Otero, Evelyn

arXiv.org Artificial Intelligence

Rapid changes and increasing climatic variability across the widely varied Köppen-Geiger regions of northern Europe generate significant needs for adaptation. Regional planning needs high-resolution projected temperatures. This work presents an integrative downscaling framework that incorporates Vision Transformer (ViT), Convolutional Long Short-Term Memory (ConvLSTM), and Geospatial Spatiotemporal Transformer with Attention and Imbalance-Aware Network (GeoStaNet) models. The framework is evaluated with a multicriteria decision system, Deep Learning-TOPSIS (DL-TOPSIS), for ten strategically chosen meteorological stations encompassing the temperate oceanic (Cfb), subpolar oceanic (Cfc), warm-summer continental (Dfb), and subarctic (Dfc) climate regions. Norwegian Earth System Model (NorESM2-LM) Coupled Model Intercomparison Project Phase 6 (CMIP6) outputs were bias-corrected during the 1951-2014 period and subsequently validated against earlier observations of day-to-day temperature metrics and diurnal range statistics. The ViT showed improved performance (Root Mean Squared Error (RMSE): 1.01 degrees C; R^2: 0.92), allowing for production of credible downscaled projections. Under the SSP5-8.5 scenario, the Dfc and Dfb climate zones are projected to warm by 4.8 degrees C and 3.9 degrees C, respectively, by 2100, with expansion in the diurnal temperature range by more than 1.5 degrees C. The Time of Emergence signal first appears in subarctic winter seasons (Dfc: approximately 2032), signifying an urgent need for adaptation measures. The presented framework offers station-based, high-resolution estimates of uncertainties and extremes, with direct uses for adaptation policy over high-latitude regions with fast environmental change.


MvHo-IB: Multi-View Higher-Order Information Bottleneck for Brain Disorder Diagnosis

Zhang, Kunyu, Li, Qiang, Yu, Shujian

arXiv.org Artificial Intelligence

Recent evidence suggests that modeling higher-order interactions (HOIs) in functional magnetic resonance imaging (fMRI) data can enhance the diagnostic accuracy of machine learning systems. However, effectively extracting and leveraging HOIs remains a significant challenge. In this paper, we propose MvHo-IB, a novel multi-view learning framework that seamlessly integrates pairwise interactions and HOIs for diagnostic decision-making while automatically compressing task-irrelevant redundant information. Our approach introduces several key innovations: (1) a principled framework combining O-information from information theory with the recently developed matrix-based Rényi's α-order entropy functional estimator to quantify and extract HOIs, (2) a purpose-built Brain3DCNN encoder designed to effectively utilize these interactions, and (3) a novel multi-view learning information bottleneck objective to enhance representation learning. Experiments on three benchmark fMRI datasets demonstrate that MvHo-IB achieves state-of-the-art performance, outperforming existing methods, including modern hypergraph-based techniques, by significant margins. The code of our MvHo-IB is available at https://github.com/zky04/MvHo-IB.